Uniform Distribution — Bounded Randomness#
The continuous uniform distribution is the simplest model for a quantity that can take any value within a known interval and is equally likely across that interval.
It shows up as:
a building block for simulation (inverse-CDF / transforms)
the canonical distribution for p-values under a true null: \(p \sim \mathrm{Uniform}(0,1)\)
the maximum-entropy distribution on a bounded interval (no other information)
What you’ll learn#
definition (PDF/CDF), support, and parameter constraints
closed-form moments, MGF/CF, and entropy
MLE / likelihood geometry (why the MLE hits min/max)
NumPy-only sampling and basic visual diagnostics
how SciPy parameterizes scipy.stats.uniform
import numpy as np
import scipy
import plotly
from scipy import stats
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook") # CKC convention
SEED = 7
rng = np.random.default_rng(SEED)
np.set_printoptions(precision=4, suppress=True)
print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
print("plotly", plotly.__version__)
numpy 1.26.2
scipy 1.15.0
plotly 6.5.2
1) Title & Classification#
Name: uniform (continuous uniform distribution)
Type: Continuous
Support: \(x \in [a, b]\)
Parameter space: \(a,b \in \mathbb{R}\) with \(a < b\)
We write \(X \sim \mathrm{Uniform}(a,b)\).
Library note (SciPy): scipy.stats.uniform uses parameters (loc, scale) with support \(x \in [\mathrm{loc},\, \mathrm{loc}+\mathrm{scale}]\) and constraint \(\mathrm{scale}>0\).
2) Intuition & Motivation#
2.1 What it models#
Use a uniform distribution when you only know that a quantity lies in a bounded interval \([a,b]\) and you have no reason to prefer any sub-interval.
Equivalently: among all continuous distributions supported on \([a,b]\), the uniform has maximum differential entropy.
2.2 Typical real-world use cases#
Randomized experiments: random assignment, random offsets, random jitter
Simulation / Monte Carlo: base source of randomness used to generate other distributions
Quality control: tolerances where any value in a band is “equally plausible”
P-values under \(H_0\): if a test is valid and the null is true, \(p \sim \mathrm{Uniform}(0,1)\)
2.3 Relations to other distributions#
Beta: \(\mathrm{Uniform}(0,1) = \mathrm{Beta}(1,1)\)
Order statistics: the sample min/max have Beta-distributed rescalings
Transforms: if \(U\sim\mathrm{Uniform}(0,1)\) and \(X = F^{-1}(U)\), then \(X\) has CDF \(F\) (inverse transform sampling)
Sums/averages: sums of i.i.d. uniforms give the Irwin–Hall distribution; the mean gives the Bates distribution
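The order-statistics relation is easy to check empirically. A minimal sketch (the sample sizes here are arbitrary): the maximum of \(n\) i.i.d. \(\mathrm{Uniform}(0,1)\) draws has CDF \(u^n\) on \([0,1]\), i.e. it is \(\mathrm{Beta}(n,1)\).

```python
import numpy as np
from scipy import stats

# Maximum of n i.i.d. Uniform(0,1) draws should be Beta(n, 1).
rng = np.random.default_rng(0)
n, m = 5, 20_000
u = rng.random((m, n))
sample_max = u.max(axis=1)

# Kolmogorov-Smirnov test against the Beta(n, 1) CDF
ks = stats.kstest(sample_max, stats.beta(n, 1).cdf)
print(f"KS statistic: {ks.statistic:.4f}")
```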
3) Formal Definition#
For \(a<b\):
3.1 PDF#
[
f(x\mid a,b) = \begin{cases}
\frac{1}{b-a}, & a \le x \le b \\
0, & \text{otherwise.}
\end{cases}
]
3.2 CDF#
[
F(x\mid a,b) = \mathbb{P}(X\le x) = \begin{cases}
0, & x < a \\
\frac{x-a}{b-a}, & a \le x \le b \\
1, & x > b.
\end{cases}
]
Because this is a continuous distribution, what happens at single points like \(x=a\) or \(x=b\) does not affect probabilities (those points have probability 0).
def uniform_pdf(x: np.ndarray, a: float, b: float) -> np.ndarray:
    """Uniform(a,b) PDF (vectorized)."""
    if not (a < b):
        raise ValueError("Require a < b")
    x = np.asarray(x, dtype=float)
    pdf = np.zeros_like(x, dtype=float)
    inside = (a <= x) & (x <= b)
    pdf[inside] = 1.0 / (b - a)
    return pdf

def uniform_cdf(x: np.ndarray, a: float, b: float) -> np.ndarray:
    """Uniform(a,b) CDF (vectorized)."""
    if not (a < b):
        raise ValueError("Require a < b")
    x = np.asarray(x, dtype=float)
    return np.where(
        x < a,
        0.0,
        np.where(x > b, 1.0, (x - a) / (b - a)),
    )

def uniform_logpdf(x: np.ndarray, a: float, b: float) -> np.ndarray:
    """Uniform(a,b) log-PDF (vectorized)."""
    if not (a < b):
        raise ValueError("Require a < b")
    x = np.asarray(x, dtype=float)
    logpdf = np.full_like(x, -np.inf, dtype=float)
    inside = (a <= x) & (x <= b)
    logpdf[inside] = -np.log(b - a)
    return logpdf
# Quick sanity check
xs = np.array([-1.0, 0.0, 0.5, 1.0, 2.0])
a, b = 0.0, 1.0
print("pdf:", uniform_pdf(xs, a, b))
print("cdf:", uniform_cdf(xs, a, b))
4) Moments & Properties#
Let \(X \sim \mathrm{Uniform}(a,b)\) and define the width \(w = b-a > 0\).
Moments#
Mean: [ \mathbb{E}[X] = \frac{a+b}{2}. ]
Variance: [ \mathrm{Var}(X) = \frac{(b-a)^2}{12} = \frac{w^2}{12}. ]
Skewness: \(0\) (symmetric around the midpoint)
(Excess) kurtosis: \(-\frac{6}{5}\) (thinner tails than a normal)
MGF and characteristic function#
MGF (all real \(t\)): [ M_X(t)=\mathbb{E}[e^{tX}] = \begin{cases} \frac{e^{tb}-e^{ta}}{t(b-a)}, & t \ne 0 \\ 1, & t=0. \end{cases} ]
Characteristic function: [ \varphi_X(t)=\mathbb{E}[e^{itX}] = \frac{e^{itb}-e^{ita}}{it(b-a)}\quad (t \ne 0),\qquad \varphi_X(0)=1. ]
Entropy (differential, in nats)#
[ H(X) = \ln(b-a) = \ln w. ]
Other notable properties#
Maximum entropy on \([a,b]\)
Affine invariance: if \(Y=cX+d\) with \(c>0\), then \(Y\sim\mathrm{Uniform}(ca+d, cb+d)\)
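Affine invariance is easy to verify by simulation. A minimal sketch with arbitrary choices \(c=2\), \(d=1\): starting from \(X\sim\mathrm{Uniform}(0,1)\), \(Y=2X+1\) should be \(\mathrm{Uniform}(1,3)\), with mean \(2\) and variance \((3-1)^2/12 = 1/3\).

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.random(200_000)   # X ~ Uniform(0, 1)
y = 2.0 * x + 1.0         # Y = cX + d with c=2, d=1 -> Uniform(1, 3)

print("min/max:", y.min(), y.max())   # should stay inside [1, 3]
print("mean   :", y.mean())           # theory: 2
print("var    :", y.var())            # theory: 1/3
```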
def uniform_mean(a: float, b: float) -> float:
    if not (a < b):
        raise ValueError("Require a < b")
    return 0.5 * (a + b)

def uniform_var(a: float, b: float) -> float:
    if not (a < b):
        raise ValueError("Require a < b")
    w = b - a
    return (w * w) / 12.0

def uniform_mgf(t: np.ndarray, a: float, b: float) -> np.ndarray:
    """MGF using a numerically stable expm1 form."""
    if not (a < b):
        raise ValueError("Require a < b")
    t = np.asarray(t, dtype=float)
    w = b - a
    out = np.empty_like(t, dtype=float)
    near0 = np.isclose(t, 0.0)
    out[near0] = 1.0
    tt = t[~near0]
    out[~near0] = np.exp(tt * a) * np.expm1(tt * w) / (tt * w)
    return out

def uniform_cf(t: np.ndarray, a: float, b: float) -> np.ndarray:
    """Characteristic function."""
    if not (a < b):
        raise ValueError("Require a < b")
    t = np.asarray(t, dtype=float)
    w = b - a
    out = np.empty_like(t, dtype=complex)
    near0 = np.isclose(t, 0.0)
    out[near0] = 1.0 + 0.0j
    tt = t[~near0]
    out[~near0] = np.exp(1j * tt * a) * np.expm1(1j * tt * w) / (1j * tt * w)
    return out

def uniform_entropy(a: float, b: float) -> float:
    if not (a < b):
        raise ValueError("Require a < b")
    return float(np.log(b - a))
a, b = -2.0, 3.0
n = 200_000
x = a + (b - a) * rng.random(n) # NumPy-only sampling
mu_hat = float(np.mean(x))
var_hat = float(np.var(x))
centered = x - mu_hat
skew_hat = float(np.mean(centered**3) / (var_hat ** 1.5))
exkurt_hat = float(np.mean(centered**4) / (var_hat**2) - 3.0)
print("theory mean:", uniform_mean(a, b), " sample:", mu_hat)
print("theory var :", uniform_var(a, b), " sample:", var_hat)
print("theory skew:", 0.0, " sample:", skew_hat)
print("theory ex-kurt:", -6/5, " sample:", exkurt_hat)
print("entropy (nats):", uniform_entropy(a, b))
# MGF check at a few t values
for t0 in [0.0, 0.2, -0.3]:
mgf_mc = float(np.mean(np.exp(t0 * x)))
mgf_th = float(uniform_mgf(np.array([t0]), a, b)[0])
print(f"t={t0:+.1f} mgf theory={mgf_th:.6f} mc={mgf_mc:.6f}")
5) Parameter Interpretation#
The parameters are literal bounds:
\(a\) is the lower limit; \(b\) is the upper limit.
The distribution is flat on \([a,b]\) with height \(1/(b-a)\).
Useful derived quantities:
Midpoint \(m = (a+b)/2\) sets the location (the mean).
Width \(w = b-a\) controls dispersion and uncertainty:
variance grows like \(w^2/12\)
entropy grows like \(\ln w\)
Changing \((a,b)\) only shifts and stretches the interval; it does not change the “shape” (it always remains a rectangle).
intervals = [(-1, 1), (0, 1), (0, 3)]
xs = np.linspace(-2.5, 3.5, 600)
fig = go.Figure()
for a, b in intervals:
fig.add_trace(
go.Scatter(
x=xs,
y=uniform_pdf(xs, a, b),
mode="lines",
name=f"a={a}, b={b}",
)
)
fig.update_layout(
title="Uniform PDF for different intervals",
xaxis_title="x",
yaxis_title="f(x)",
)
fig.show()
6) Derivations#
6.1 Expectation#
Using \(f(x)=1/(b-a)\) on \([a,b]\):
[ \mathbb{E}[X] = \int_a^b x \,\frac{1}{b-a}\,dx = \frac{1}{b-a}\left[\frac{x^2}{2}\right]_a^b = \frac{b^2-a^2}{2(b-a)} = \frac{a+b}{2}. ]
6.2 Variance#
First compute \(\mathbb{E}[X^2]\):
[ \mathbb{E}[X^2] = \int_a^b x^2 \,\frac{1}{b-a}\,dx = \frac{1}{b-a}\left[\frac{x^3}{3}\right]_a^b = \frac{b^3-a^3}{3(b-a)} = \frac{a^2+ab+b^2}{3}. ]
Then [ \mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \frac{a^2+ab+b^2}{3} - \left(\frac{a+b}{2}\right)^2 = \frac{(b-a)^2}{12}. ]
6.3 Likelihood and MLE#
For i.i.d. data \(x_1,\dots,x_n\):
[ L(a,b) = \prod_{i=1}^n f(x_i\mid a,b) = \left(\frac{1}{b-a}\right)^n \mathbf{1}\{a \le x_{(1)},\; b \ge x_{(n)}\}, ]
where \(x_{(1)}=\min_i x_i\) and \(x_{(n)}=\max_i x_i\).
So the log-likelihood (when the indicator is 1) is [ \ell(a,b) = -n\ln(b-a). ]
To maximize it you want \(b-a\) as small as possible while still containing the data, giving the MLE: [ \hat a = x_{(1)},\qquad \hat b = x_{(n)}. ]
This also explains why fitting a uniform distribution is extremely sensitive to outliers: the likelihood depends only on the min and max.
# Demonstrate the MLE geometry
true_a, true_b = 2.0, 5.0
x = true_a + (true_b - true_a) * rng.random(200)
a_hat = float(np.min(x))
b_hat = float(np.max(x))
print("true (a,b):", (true_a, true_b))
print("MLE (a,b):", (a_hat, b_hat))
# Compare to SciPy's fit (maps to loc, scale)
loc_hat, scale_hat = stats.uniform.fit(x)
print("scipy fit loc, scale:", (float(loc_hat), float(scale_hat)))
print("scipy fit a,b:", (float(loc_hat), float(loc_hat + scale_hat)))
true (a,b): (2.0, 5.0)
MLE (a,b): (2.011202726156228, 4.9865008503031785)
scipy fit loc, scale: (2.011202726156228, 2.9752981241469505)
scipy fit a,b: (2.011202726156228, 4.9865008503031785)
7) Sampling & Simulation (NumPy-only)#
Inverse transform sampling#
The CDF on \([a,b]\) is \(F(x)=(x-a)/(b-a)\). If \(U\sim\mathrm{Uniform}(0,1)\) and we set \(U = F(X)\), we get:
[ U = \frac{X-a}{b-a}\quad\Rightarrow\quad X = a + (b-a)U. ]
Algorithm:
draw \(U\) uniformly on \([0,1)\)
return \(X=a+(b-a)U\)
rng.random(size) gives samples in \([0,1)\), which is perfect for continuous sampling (endpoint inclusion is a probability-zero event).
def sample_uniform(a: float, b: float, size: int | tuple[int, ...], rng: np.random.Generator) -> np.ndarray:
    """NumPy-only sampler for Uniform(a,b)."""
    if not (a < b):
        raise ValueError("Require a < b")
    return a + (b - a) * rng.random(size)
a, b = -1.0, 2.0
x = sample_uniform(a, b, size=10_000, rng=rng)
print("sample mean:", float(np.mean(x)), " theory:", uniform_mean(a, b))
print("sample var :", float(np.var(x)), " theory:", uniform_var(a, b))
8) Visualization#
We’ll visualize:
the PDF (flat “rectangle”)
the CDF (a linear ramp from 0 to 1)
a Monte Carlo histogram compared to the analytic PDF
a, b = 0.0, 1.5
xs = np.linspace(-0.5, 2.0, 800)
# PDF
fig_pdf = go.Figure(
data=[go.Scatter(x=xs, y=uniform_pdf(xs, a, b), mode="lines", name="PDF")]
)
fig_pdf.update_layout(title="Uniform PDF", xaxis_title="x", yaxis_title="f(x)")
fig_pdf.show()
# CDF
fig_cdf = go.Figure(
data=[go.Scatter(x=xs, y=uniform_cdf(xs, a, b), mode="lines", name="CDF")]
)
fig_cdf.update_layout(title="Uniform CDF", xaxis_title="x", yaxis_title="F(x)")
fig_cdf.show()
# Monte Carlo samples
n = 8_000
samples = sample_uniform(a, b, size=n, rng=rng)
hist = px.histogram(samples, nbins=40, histnorm="probability density", title="Monte Carlo samples")
hist.add_trace(go.Scatter(x=xs, y=uniform_pdf(xs, a, b), mode="lines", name="PDF (theory)"))
hist.update_layout(xaxis_title="x", yaxis_title="density")
hist.show()
9) SciPy Integration (scipy.stats.uniform)#
SciPy parameterizes the uniform as:
[ X \sim \texttt{stats.uniform}(\text{loc}, \text{scale}) \quad\Longleftrightarrow\quad X \sim \mathrm{Uniform}(a,b)\ \text{with}\ a=\text{loc},\; b=\text{loc}+\text{scale}. ]
Common methods:
pdf(x), cdf(x)
rvs(size, random_state=...)
fit(data) (MLE for loc and scale)
a, b = -2.0, 1.0
rv = stats.uniform(loc=a, scale=b - a)
xs = np.linspace(-3.0, 2.0, 400)
print("pdf at 0:", float(rv.pdf(0.0)))
print("cdf at 0:", float(rv.cdf(0.0)))
# Sampling
s = rv.rvs(size=5, random_state=SEED)
print("rvs:", s)
# Fitting
data = rv.rvs(size=300, random_state=123)
loc_hat, scale_hat = stats.uniform.fit(data)
print("fit loc, scale:", (float(loc_hat), float(scale_hat)))
print("fit interval :", (float(loc_hat), float(loc_hat + scale_hat)))
# Visual comparison: analytic vs SciPy
fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=uniform_pdf(xs, a, b), mode="lines", name="PDF (ours)"))
fig.add_trace(go.Scatter(x=xs, y=rv.pdf(xs), mode="lines", name="PDF (SciPy)", line=dict(dash="dash")))
fig.update_layout(title="PDF: our implementation vs SciPy", xaxis_title="x", yaxis_title="f(x)")
fig.show()
pdf at 0: 0.3333333333333333
cdf at 0: 0.6666666666666666
rvs: [-1.7711 0.3398 -0.6848 0.1704 0.934 ]
fit loc, scale: (-1.9919358062770378, 2.980194788380514)
fit interval : (-1.9919358062770378, 0.9882589821034764)
10) Statistical Use Cases#
10.1 Hypothesis testing#
P-values under a true null: if \(H_0\) is true and the test is calibrated, then \(p \sim \mathrm{Uniform}(0,1)\).
Testing for uniformity: the Kolmogorov–Smirnov test compares the empirical CDF to \(F(x)=x\) on \([0,1]\).
10.2 Bayesian modeling#
Bounded priors: \(\theta\sim\mathrm{Uniform}(a,b)\) is a simple prior when \(\theta\) is known to be in \([a,b]\).
Caution: “uniform” is not invariant to reparameterization (uniform in \(\theta\) is not uniform in \(\log\theta\)). For scale parameters, it’s common to consider log-uniform or Jeffreys-type priors instead.
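A small sketch of that caution (the interval \([1,10]\) is an arbitrary illustration): draw \(\theta\) uniformly and look at how \(\log\theta\) is distributed.

```python
import numpy as np

# theta ~ Uniform(1, 10), then transform to log(theta)
rng = np.random.default_rng(0)
theta = 1.0 + 9.0 * rng.random(100_000)
log_theta = np.log(theta)

# Equal-width bins on [0, log 10]: a flat prior on theta piles mass
# into the upper bins of log(theta) -- it is not flat there.
counts, edges = np.histogram(log_theta, bins=10, range=(0.0, np.log(10.0)))
print("first bin:", counts[0], " last bin:", counts[-1])
```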
10.3 Generative modeling#
Uniform noise is a common base distribution.
With a transform \(X = g(U)\) you can generate complex distributions; inverse-CDF sampling is the special case \(g=F^{-1}\).
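As a sketch of such a transform (the rate \(\lambda=2\) is an arbitrary choice): inverse-CDF sampling turns uniform noise into an exponential via \(g(u) = F^{-1}(u) = -\ln(1-u)/\lambda\).

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 2.0
u = rng.random(200_000)        # base Uniform[0, 1) noise
x = -np.log1p(-u) / lam        # g(U) = F^{-1}(U) for Exponential(lam)

print("sample mean:", x.mean(), " theory:", 1 / lam)   # E[X] = 1/lam
```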
# P-values are Uniform(0,1) under a true null (illustration)
# We'll repeatedly test whether N(0,1) data has mean 0 (true).
m = 10_000
n = 25
x = rng.normal(loc=0.0, scale=1.0, size=(m, n))
res = stats.ttest_1samp(x, popmean=0.0, axis=1)
pvals = res.pvalue
# Visualize histogram against the Uniform(0,1) PDF (which equals 1 on [0,1])
fig = px.histogram(pvals, nbins=40, histnorm="probability density", title="Histogram of p-values under H0")
fig.add_hline(y=1.0, line_dash="dash", line_color="black", annotation_text="Uniform(0,1) density = 1")
fig.update_layout(xaxis_title="p-value", yaxis_title="density")
fig.show()
# KS test for uniformity
ks = stats.kstest(pvals, "uniform")
print("KS statistic:", float(ks.statistic), " p-value:", float(ks.pvalue))
KS statistic: 0.012517525083212355 p-value: 0.0863724270399534
11) Pitfalls#
Continuous vs discrete: “uniform distribution” can mean a discrete uniform on \(\{1,\dots,k\}\) or a continuous uniform on \([a,b]\).
Invalid parameters: must have \(a<b\) (SciPy: scale>0). The case \(a=b\) is a degenerate distribution (a point mass), not a continuous uniform.
Outliers dominate fit: the MLE uses only the sample min and max.
Not automatically “uninformative”: a uniform prior depends on the chosen parameterization.
Numerical issues: when \(b-a\) is extremely small, the density \(1/(b-a)\) is huge; log-likelihood can be very large and optimization can be unstable.
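The last pitfall follows directly from \(\ell(a,b) = -n\ln(b-a)\): shrinking the width inflates the log-likelihood without bound, which can destabilize generic optimizers. A minimal illustration:

```python
import numpy as np

# Log-likelihood of n points inside a uniform of width (b - a)
n = 100
for width in [1.0, 1e-6, 1e-12]:
    loglik = -n * np.log(width)
    print(f"width={width:.0e}  log-likelihood={loglik:.1f}")
```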
12) Summary#
\(X\sim\mathrm{Uniform}(a,b)\) is the canonical continuous distribution on a bounded interval.
PDF is constant \(1/(b-a)\) on \([a,b]\); CDF is a linear ramp.
Mean \((a+b)/2\), variance \((b-a)^2/12\), entropy \(\ln(b-a)\).
Sampling is just scale + shift of \(U\sim\mathrm{Uniform}(0,1)\).
The MLE for \((a,b)\) is \((\min x_i, \max x_i)\), which makes fitting sensitive to outliers.